Project Overview


The research project supported by this grant has become known informally as the PISCES 
Project (an acronym for Parallel Implementation of Scientific Computing Environments). The 
long term objective of the PISCES project is to develop better ways to evaluate the new parallel 
computer systems that are coming on the market. The particular target is evaluation of the
effectiveness of these parallel systems for the large-scale scientific software systems of interest to
NASA. The ultimate goal of the work is to influence the design of the next generation of paral- 
lel systems to better meet the needs of NASA software systems. 

The major difficulties in evaluating the emerging commercial parallel machines are the
following. First, the large variety of parallel architectures available makes evaluation
difficult. Reprogramming several large NASA systems for each architecture in order to compare
performance would be expensive and time-consuming.

Second, evaluation of these machines should be in the context of the same sort of program- 
ming environments, using Fortran or similar high-level languages, that have traditionally been 
available on sequential machines. These software layers have an impact on performance that is 
important to evaluate. 

Third, the performance of parallel machines on large NASA software systems is the most
important basis for evaluation. Performance on small test programs or on particular algorithms is
of less interest.

Fourth, the evaluation must take into account the amount of reprogramming of the existing 
NASA software base that would be required to effectively use these parallel machines. Most of 
this existing code is in Fortran. 


Approach 

The PISCES project approach is to build a series of "testbed" programming environments 
to support the evaluation of a large range of parallel architectures. The conceptual basis of the 
PISCES environments lies in a focus on careful definition of the underlying "virtual machine" 
provided by the PISCES system for the user. The goal is to implement the same virtual machine 
on each target architecture, by porting the PISCES software to each machine. 

To experiment with Fortran extensions for expressing parallelism, the PISCES environ- 
ments are designed so that new Fortran extensions can be easily implemented. To experiment 
with the porting of existing NASA Fortran-based codes, the PISCES environments are designed 
to allow Fortran codes to be run on various parallel architectures with minimal change. The 
PISCES environments provide support for automatic collection of performance data, so that per- 
formance of large systems may be easily monitored and measured on different architectures. 
Major Results


Two PISCES environments have been designed and implemented during this grant. The 
PISCES 1 environment was implemented during 1984-85 on a DEC VAX uniprocessor under 
the UNIX operating system. PISCES 1 was later ported to a network of Apollo workstations. 
The PISCES 2 environment was implemented during 1986-87 for the NASA Flexible FLEX/32 
parallel computer. Work on the PISCES 2 system and its applications is continuing through the 
Institute for Computer Applications in Science and Engineering (ICASE) at NASA Langley 
Research Center. A new formal model of parallel computation has been developed as part of the 
Ph.D. research of P.D. Stotts. 


Summary of Research Activities 

The major tasks performed during this research grant have been the following. 


The PISCES 1 Parallel Programming Environment 

The first major phase involved the design and implementation of the PISCES 1 parallel pro- 
gramming environment [2,7]. The major parts of this activity were the following: 

1. Design of the PISCES 1 environment. Building on some design work at ICASE that preceded
the grant period, we successfully completed the design of a PISCES 1 parallel programming
environment, based on Fortran 77 and UNIX. The environment provides parallelism to the
programmer at two different "granularities" of operation: "clusters" and "tasks". The parallel 
"virtual machine" defined by the PISCES software was carefully specified in the User’s Manual 
for the system [2]. The virtual machine was made visible to the programmer as a set of exten- 
sions to Fortran 77 for controlling the operation of the virtual machine (the run-time environ- 
ment for program execution). 

2. Implementation of PISCES 1 on a uniprocessor. With the help of two University of Vir- 
ginia graduate students, Nancy Fitzgerald and Jeff Taylor, who spent Summer 1984 at NASA 
Langley, a prototype implementation of PISCES 1 was completed for the DEC VAX under the 
UNIX operating system. This uniprocessor implementation provides simulated parallelism by 
using UNIX processes, which time-share the CPU. The programmer writes programs in Pisces 
Fortran (Fortran 77 with parallel extensions). Pisces Fortran programs are translated by a 
preprocessor into standard Fortran 77. A run-time library implements the various parts of the 
PISCES 1 virtual machine on the VAX. A complete record of the initiation and communication 
of parallel tasks is left on a set of files after a run. A set of analysis programs may then be used 
to collect summary statistics on program performance [3]. The PISCES 1 system was made 
available to users at the University of Virginia and ICASE. Several substantial applications 
were programmed for the system (described below). 
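
As a rough illustration of the kind of post-run analysis described above, the following Python
sketch summarizes a task trace. The record layout (time, event, task name) and the event names
INIT, SEND, and TERM are assumptions made for illustration; they are not the actual PISCES 1
trace file format, and the function is not one of the PISCES analysis programs.

```python
from collections import Counter

# Hypothetical trace records: "<time_seconds> <event> <task_name>".
# This layout is an assumption, not the PISCES 1 file format.
SAMPLE_TRACE = """\
0.00 INIT solver
0.10 INIT worker1
0.15 SEND solver
0.40 SEND worker1
0.90 TERM worker1
1.20 TERM solver
"""

def summarize(trace_text):
    """Return (event counts, per-task lifetime in seconds)."""
    counts = Counter()
    started = {}
    lifetimes = {}
    for line in trace_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        time_str, event, task = line.split()
        t = float(time_str)
        counts[event] += 1
        if event == "INIT":
            started[task] = t
        elif event == "TERM" and task in started:
            lifetimes[task] = t - started[task]
    return counts, lifetimes

counts, lifetimes = summarize(SAMPLE_TRACE)
print(dict(counts))
print(lifetimes)
```

Counting events and task lifetimes is only one possible summary; the same loop could be
extended to gather per-task message counts.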

3. Apollo workstation network implementation of the PISCES 1 environment. The PISCES 
1 system was successfully ported to a local network of Apollo workstations at the University of 
Virginia in 1985 by Nancy Fitzgerald. The network consisted of 10 Apollo DN 300 
workstations connected by an Ethernet-like local network. The system implements the same
PISCES 1 virtual machine as the VAX implementation, so that applications programs may run 
on either system without change in the code. The programmer sees the workstation network as a 
single clustered parallel machine. The programmer may distribute parallel applications tasks to 
these clusters by using Pisces Fortran commands. 

In constructing this implementation no attempt was made to optimize performance; we 
used the same basic implementation strategy as on the VAX. After implementation was com- 
plete, Fitzgerald studied the basic performance of the system in her M.S. thesis [5]. The basic 
performance parameters are the time to initiate a task and the time to send a message. Due to the 
amount of disk access involved in both activities on the Apollos, both task initiation and
message passing were found to be unacceptably slow for realistic applications. To improve this
performance we would have had to change the operating system on the Apollos, which was not possible
at the time. As a result, we did not pursue this implementation. 
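
To make concrete why these two parameters matter, the following Python sketch estimates the
fraction of a run spent on task initiation and message passing. The cost model is a
simplifying assumption introduced here (overheads are summed and not overlapped with
computation), and the numbers are illustrative only; they are not measurements from [5].

```python
def overhead_fraction(n_tasks, t_init, n_msgs, t_msg, t_compute):
    """Fraction of total run time spent initiating tasks and sending messages.

    Assumes overhead is not overlapped with computation (a simplification).
    """
    overhead = n_tasks * t_init + n_msgs * t_msg
    return overhead / (overhead + t_compute)

# Illustrative values only: 10 tasks at 0.5 s each to initiate,
# 20 messages at 0.25 s each, and 10 s of useful computation.
fraction = overhead_fraction(n_tasks=10, t_init=0.5, n_msgs=20, t_msg=0.25,
                             t_compute=10.0)
print(f"overhead fraction: {fraction:.2f}")
```

Under this model, overhead dominates whenever the per-task and per-message costs are large
relative to the computation each task performs.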

4. Applications of the PISCES 1 environment. Two applications of PISCES 1 to problems
of interest to NASA were the following. Neither was supported directly by this grant, but
both illustrate the potentially broad application of the PISCES design. 

(a) Sparse matrix iterative equation solver. Prof. Merrell Patrick of Duke University, in a 
collaborative effort with this project, designed a series of six variations on a general parallel 
solution package for sparse matrix representations of linear systems of equations. These six 
variations were implemented in Pisces Fortran and were run on the VAX implementation of 
PISCES 1 at ICASE. The variations included a data flow version, two chaotic versions, and 
various forms of communication (direct, broadcast, etc.) between the parallel tasks. This work is 
reported in [4,9]. 

(b) Dynamic scene analysis system. Prof. Worthy Martin and Ph.D. candidate Chew-lim 
Tan of the U.Va. Computer Science Department used the PISCES 1 system for an application of
artificial intelligence techniques to an image processing problem: the tracking of moving objects 
in dynamically changing scenes. Parallel tasks are used to locate and track different objects 
within the scene, which is continuously changing as new inputs arrive from the camera. This 
work has led to a Ph.D. thesis, completed by Tan in May 1986. 


The PISCES 2 Parallel Programming Environment 

The second major phase of the PISCES project involved the design and implementation of 
a second parallel programming environment, this time for the NASA Flexible FLEX/32 parallel 
computer [13,14]. The major parts of this activity were the following:

1. Design of the PISCES 2 environment. The PISCES 2 system provides a rich environ- 
ment for experimenting with parallel programming concepts. The main concepts are: 

(a) The virtual machine is relatively independent of the underlying architecture, so that pro- 
grams in Pisces Fortran are not written using constructs peculiar to one parallel architecture. 

(b) The virtual machine provides several granularities of parallel operation. In PISCES 2 
parallel operation is provided at the large-grain level (clusters of tasks, and tasks within a
cluster) and at the medium-grain level (loop iterations and arbitrary program segments).
(c) The virtual machine architecture is "clustered", with each cluster representing a group
of processing and memory resources of the underlying machine. The cluster organization is 
static within a single run, but it may be varied between runs. 

(d) The operating system is represented as a static set of tasks that run within each cluster 
of the virtual machine. User tasks invoke operating system functions by sending messages to 
these operating system "controller" tasks. 

(e) User programs are represented at the top level by a set of tasks that are dynamically
initiated and terminated during a run. Tasks communicate using asynchronous message passing.
Message queues are unbounded (limited only by available memory), and the receiving task may
"accept" a message in a variety of different ways, or may never accept it.

(f) Many of the "FORCE" constructs developed by Harry Jordan of the University of 
Colorado are included in PISCES Fortran. These constructs provide medium-grain parallelism, 
including parallel execution of loop iterations, subroutine calls, and arbitrary program segments. 
Synchronization mechanisms include barriers and critical regions with lock variables. Commun- 
ication is via shared variables in COMMON blocks. 

(g) The programmer controls the mapping of hardware resources to the "clusters" of the 
PISCES 2 virtual machine. Each run of a program may be configured differently. Currently the 
mapping consists of assignment of a group of processors to each cluster for running tasks and 
"forces". 

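The two styles of parallelism in items (e) and (f) can be sketched in Python as follows. This
is an analogy only: Python threads stand in for PISCES tasks and force members, a dictionary
stands in for variables in a COMMON block, and the names Task, send, accept, and force_member
are hypothetical and are not Pisces Fortran constructs.

```python
import queue
import threading

class Task:
    """A task with an unbounded mailbox, as in item (e)."""

    def __init__(self, name):
        self.name = name
        self.mailbox = queue.Queue()  # maxsize 0: bounded only by memory

    def send(self, msg_type, payload=None):
        # Asynchronous send: the sender never waits for the receiver.
        self.mailbox.put((msg_type, payload))

    def accept(self):
        # The receiver chooses when to take its next message.
        return self.mailbox.get()

def summing_task(task, results):
    total = 0
    while True:
        msg_type, payload = task.accept()
        if msg_type == "stop":
            break
        total += payload
    results[task.name] = total

def force_member(index, barrier, lock, shared):
    """One member of a force, as in item (f)."""
    with lock:                      # critical region guarded by a lock
        shared["sum"] += index + 1
    barrier.wait()                  # all members arrive before any continues
    shared["seen"][index] = shared["sum"]

# Large-grain: messages can be queued before the receiver starts running.
results = {}
summer = Task("summer")
for value in (1, 2, 3):
    summer.send("data", value)
summer.send("stop")
receiver = threading.Thread(target=summing_task, args=(summer, results))
receiver.start()
receiver.join()

# Medium-grain: four force members update one shared variable.
n = 4
shared = {"sum": 0, "seen": [0] * n}
barrier, lock = threading.Barrier(n), threading.Lock()
members = [threading.Thread(target=force_member, args=(i, barrier, lock, shared))
           for i in range(n)]
for m in members:
    m.start()
for m in members:
    m.join()
print(results, shared)
```

Because every member passes the barrier only after all updates are complete, each member
observes the same final value of the shared sum.
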
2. Implementation of PISCES 2 on the NASA FLEX/32. Implementation of the PISCES 2
design was completed in June 1987. The PISCES 2 system as implemented on the NASA 
FLEX/32 consists of several software components: 

(a) Preprocessor. The extensions to Fortran 77 that form the Pisces Fortran language are 
implemented with a preprocessor that converts Pisces Fortran to standard Fortran 77, which is 
then compiled with the FLEX Fortran compiler. The preprocessor converts many parallel con- 
structs to in-line Fortran code. The core parallel constructs are implemented with a small run- 
time library of routines called from the Fortran code. 

(b) Configuration environment. Before each individual run of a parallel program, the user 
works in the PISCES "configuration environment" to choose the configuration options for that 
particular run. These options include the choice of the number of clusters and the cluster 
numbering, choice of the mapping of hardware resources to these clusters, creation of loadfiles 
for downloading to hardware PE’s, and choice of run-time options such as time limits, tracing 
options, etc. These options are chosen with a series of menus and prompts. The user does not 
need to know FLEX operating system commands for this activity. 

Configurations may be saved in files for later editing/reuse in other runs of the same or dif- 
ferent programs. Thus a user can build a library of configurations for use in comparative perfor- 
mance evaluation studies. 

(c) Execution environment. When the user has finished choosing the configuration, execu- 
tion of a parallel run may be initiated from the configuration environment. When the program 
begins execution on the FLEX parallel processors, a PISCES "execution environment" provides 
a menu of commands that allow the user to interact in real-time with the running program. 
Options include the ability to initiate or kill running tasks, send messages to tasks, monitor exe- 
cution in real-time, save trace data about key events in trace files, and view the system state 
(including detailed information about storage management and message queue contents).
(d) Postprocessing of trace files. Trace information saved during a run may be processed
off-line after the run to obtain detailed information for timing and debugging parallel program 
components. 
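
The idea of a reusable library of saved configurations, described in item (b), can be
sketched in Python as follows. The field names, the JSON file format, and the rule that a
processor belongs to at most one cluster are all assumptions made for this illustration;
the actual PISCES 2 configuration file format is not described here.

```python
import json
import os
import tempfile

# Hypothetical configuration fields, loosely following the options in item (b).
config = {
    "clusters": {"1": [1, 2, 3], "2": [4, 5]},  # cluster number -> processors
    "time_limit_seconds": 600,
    "tracing": {"task_events": True, "message_events": False},
}

def save_config(cfg, path):
    with open(path, "w") as f:
        json.dump(cfg, f, indent=2)

def load_config(path):
    with open(path) as f:
        return json.load(f)

def validate(cfg):
    """Assumed rule: no processor is assigned to more than one cluster."""
    seen = set()
    for processors in cfg["clusters"].values():
        for p in processors:
            if p in seen:
                return False
            seen.add(p)
    return True

path = os.path.join(tempfile.mkdtemp(), "two_cluster.json")
save_config(config, path)

# Reuse a saved configuration for a later run: move processor 3 to cluster 2.
reloaded = load_config(path)
reloaded["clusters"]["1"].remove(3)
reloaded["clusters"]["2"].append(3)
print(validate(reloaded), reloaded["clusters"])
```

Keeping each configuration in its own file makes it straightforward to rerun the same program
under several mappings and compare the results.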

The PISCES 2 system is available on the CSM FLEX/32 to NASA users through the LaRC 
network. A short course for users of the system was given in April 1987 at LaRC, ICASE, and 
the University of Virginia. 


New Formal Model of Parallel Computation 

A new formal model of concurrent computation has been developed by P.D. Stotts in a 
Ph.D. thesis [8]. The model is based on the mathematical system known as "H-graph semantics" 
(developed by the principal investigator) together with a "timed Petri net" model of the parallel 
aspects of a system. Stotts has shown that the model can be used for timing analyses of hierarch- 
ical parallel systems and for automatic detection and resolution of conflicts in concurrent access 
to shared data. We hope to use an extension of this model as the basis for a formal definition of 
the PISCES 2 virtual machine, which would be potentially of value to the PISCES implementor 
and user. A paper on this work was presented at a workshop in Europe in summer 1985 [6]; a 
journal paper has been accepted for publication [15]. 


Continuing Activities 

Although the University of Virginia grant has terminated, the PISCES project is continuing 
under NASA sponsorship as an ICASE project. The major short-term continuing activities are 
the following: 

1. Performance measurement and performance tuning of the PISCES 2 system. The 
PISCES 2 system has been fully implemented on the FLEX/32, but there is still much to be 
learned about the performance of this software system for typical parallel algorithms of interest 
to NASA. Initial comparative measurements of performance by Prof. Merrell Patrick and Mark 
Jones of Duke University on a Choleski factorization algorithm indicate that PISCES 2
outperforms the Concurrent Fortran system supplied by Flexible Corp. for its own FLEX/32
machine. More extensive studies of PISCES 2 performance are currently under way. After we
understand where the performance bottlenecks of the
system occur, additional "tuning" of the system to improve performance is anticipated. 

2. Implementation of NICE/SPAR as a PISCES 2 program. The CSM NICE/SPAR testbed 
system for structural analysis is being modified for parallel operation under PISCES 2. The 
SPAR modules will be implemented as Pisces Fortran "tasks". The command stream and data 
base actions processed by NICE will be modified to allow parallel operation. Only minimal 
change to the system will be made, on the order of what might reasonably be expected in moving 
any large software system into a parallel environment. The performance speedup of the parallel 
version of NICE/SPAR will be measured after implementation. With this system we hope to 
better understand the problems of moving large NASA Fortran-based software systems into a 
parallel environment, and what performance gains might realistically be expected. 
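
For reference, speedup and efficiency are conventionally computed as in the following Python
sketch. We assume here that these conventional definitions are the ones intended; the
timings are illustrative values only and are not measurements of NICE/SPAR.

```python
def speedup(t_sequential, t_parallel):
    """Conventional speedup: sequential run time divided by parallel run time."""
    return t_sequential / t_parallel

def efficiency(t_sequential, t_parallel, n_processors):
    """Speedup per processor; 1.0 would be ideal linear speedup."""
    return speedup(t_sequential, t_parallel) / n_processors

# Illustrative run times in seconds, keyed by number of processors.
timings = {1: 120.0, 2: 64.0, 4: 40.0, 8: 30.0}
for p in sorted(timings):
    s = speedup(timings[1], timings[p])
    e = efficiency(timings[1], timings[p], p)
    print(f"{p} processors: speedup {s:.2f}, efficiency {e:.2f}")
```

Efficiency that falls as processors are added, as in these illustrative numbers, is the usual
sign that overhead or sequential portions of the program are limiting the gain.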
3. Implementation of new parallel algorithms. Two existing PISCES 1 applications programs,
the sparse matrix package of Patrick and the dynamic scene analysis package of Martin
and Tan, are being adapted for PISCES 2 on the FLEX/32. PISCES 2 provides a rich environment
for experimentation with new parallel algorithms, and we plan to implement a number of
additional new algorithms, working in collaboration with other NASA, ICASE, and university 
researchers. 

4. PISCES 3 for the NCUBE and Intel hypercubes. A new parallel processing research 
center has been established at the University of Virginia, called the Virginia Institute for Parallel 
Computation (VIPC). Its initial focus will be on software systems and applications for hyper- 
cube architecture parallel computers. VIPC is installing large-scale hypercube parallel machines 
from NCUBE and Intel. The PISCES 2 system is being modified and extended for these hyper- 
cube machines. The modified system is named PISCES 3. When available in 1988, PISCES 3 
will provide a third environment for experimentation and evaluation of parallel architectures.
This work is being funded by non-NASA sources. 


Conclusions 

The PISCES project is an open-ended project that is expected to continue for several more 
years. Excellent progress was made during the period of this grant, which funded the initial
phases of the project. The PISCES software developed under this grant and installed on the 
NASA FLEX/32 provides a rich base for further development, experimentation, and evaluation. 